A comparison of pronunciation modeling approaches for HMM-TTS

Authors

Gabriel Webster

Sacha Krstulovic

Kate Knill

Abstract

Hidden Markov model-based text-to-speech (HMM-TTS) systems are often trained on manual phonetic transcriptions of the voice corpus. Because these manual pronunciations cannot be predicted with complete accuracy at synthesis time, the result is a mismatch between training and synthesis. In this paper, an alternative approach is proposed in which a set of manually written post-lexical effects (PLE) rules, modeling a range of continuous speech effects, is applied to canonical lexicon pronunciations, and the resulting matched PLE phone sequences are used both in the voice corpus markup and at synthesis time. For a US English system, a subjective evaluation showed that a system trained on matched PLE markup and a system trained on manual phone markup were equally preferred. This suggests that it may be possible to replace manual pronunciations with matched PLE pronunciations, dramatically decreasing the time and cost required to produce an HMM-TTS voice.
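The matched-markup idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the phone symbols, the toy lexicon, the single flapping rule, and the names `Rule`, `apply_ple`, and `lexicon_to_ple` are hypothetical and are not taken from the paper, whose actual rule set is not given here. The point is only that one deterministic function maps canonical lexicon pronunciations to PLE phone sequences, so the same function can be run over the voice corpus text and over input text at synthesis time.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

# Hypothetical vowel inventory (ARPAbet-like symbols, assumed for illustration).
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ax", "ay", "eh", "er",
          "ey", "ih", "iy", "ow", "oy", "uh", "uw"}


def is_vowel(phone: Optional[str]) -> bool:
    """True if the neighbouring phone exists and is a vowel."""
    return phone is not None and phone in VOWELS


class Rule(NamedTuple):
    """A context-dependent rewrite: target -> replacement between left and right."""
    name: str
    target: str
    replacement: str
    left: Callable[[Optional[str]], bool]
    right: Callable[[Optional[str]], bool]


# One illustrative rule: a simplified version of US English flapping,
# rewriting /t/ as a flap between two vowels. This is an assumption,
# not a rule quoted from the paper; real flapping also depends on stress.
RULES: List[Rule] = [
    Rule("flapping", "t", "dx", is_vowel, is_vowel),
]

# Toy canonical lexicon (hypothetical entries).
LEXICON: Dict[str, List[str]] = {
    "get": ["g", "eh", "t"],
    "it": ["ih", "t"],
    "stop": ["s", "t", "aa", "p"],
}


def apply_ple(phones: List[str], rules: List[Rule] = RULES) -> List[str]:
    """Apply the first matching rule at each position, using canonical context."""
    out = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        for rule in rules:
            if phone == rule.target and rule.left(left) and rule.right(right):
                phone = rule.replacement
                break
        out.append(phone)
    return out


def lexicon_to_ple(words: List[str], lexicon: Dict[str, List[str]] = LEXICON) -> List[str]:
    """Concatenate canonical pronunciations, then apply PLE rules across word boundaries."""
    canonical = [p for w in words for p in lexicon[w]]
    return apply_ple(canonical)


if __name__ == "__main__":
    # The same call would label the corpus transcript and the synthesis input.
    print(lexicon_to_ple(["get", "it"]))
```

Because the rules operate on the concatenated phone sequence, effects that span word boundaries (as continuous speech effects often do) are handled by the same pass as word-internal ones.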

Similar resources

Speaker adaptation using a parallel phone set pronunciation dictionary for Thai-English bilingual TTS

This paper develops a bilingual Thai-English TTS system from two monolingual HMM-based TTS systems. An English Nagoya HMM-based TTS system (HTS) provides correct pronunciations of English words, but its voice is different from the voice in a Thai HTS system. We apply a CSMAPLR adaptation technique to make the English voice sound more similar to the Thai voice. To overcome a phone mapping proble...

Comparison of chironomic stylization versus statistical modeling of prosody for expressive speech synthesis

Chironomic stylization is the process of real-time modification of intonation contours (f0 and tempo) using drawing/writing gestures with a stylus on a graphic tablet. The question addressed in this research is whether hand-made intonation stylization could improve or degrade expressivity and overall quality, compared to statistical modeling of prosody. A system for expressive TTS in French bas...

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

This paper introduces a syllable HMM-based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the syllable HMM-based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results s...

Pronunciation lexicon adaptation for TTS voice building

This paper describes reducing phone label errors in TTS voice building by means of modeling of speaker pronunciation variants. Each speaker has his or her own unique pronunciations (and context-dependent variations), so that no one standard lexicon is able to cover all of the speaker’s variations. Creating speaker-dependent pronunciation lexicons for automatic speech labeling of our TTS voice d...

EFL Pronunciation Teaching: A Theoretical Review

This study aims to describe the developing status of pronunciation teaching and presents current perspectives on pronunciation learning and teaching, coupled with innovative approaches and techniques/activities. It is argued that pronunciation teaching methodologies have changed over the decades since the Reform Movement. The exact status of teaching pronunciation appeared first in the Audio L...

Publication date: 2010